A Novel Data Driven Algorithm for Tamil Morphological Generator
Abstract
Tamil is a morphologically rich, agglutinative language: most word features are affixed postpositionally to the root word. A morphological generator takes a lemma, a POS category and a morpho-lexical description as input and produces a word form as output; it is the reverse of a morphological analyzer. In any natural language generation system, the morphological generator is an essential component of the post-processing stage. The morphological generator implemented here is based on a new algorithm that is simple and efficient and requires neither rules nor a morpheme dictionary. Nouns and verbs are classified into paradigms following Dr. S. Rajendran's paradigm classification. Tamil verbs are classified into 32 paradigms with 1884 inflected forms, and nouns into 25 paradigms with 325 word forms. The approach requires only a small amount of data, so it can be easily applied to other less-resourced, morphologically rich languages.
General Terms: Data driven approach, Natural Language Processing, Morphological Generator, Machine Translation
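To make the input/output contract concrete, the following is a minimal sketch of a paradigm-based, data-driven generator. It is not the algorithm of the paper, whose details the abstract does not give. It assumes that each paradigm is represented by one sample lemma with its inflected forms, and that a new lemma of the same paradigm inflects by the same "strip and append" edit. The function names `learn_paradigm` and `generate`, the feature tags `PL` and `ACC`, and the informal romanization of the Tamil examples are all illustrative assumptions.

```python
from os.path import commonprefix


def learn_paradigm(examples):
    """Derive (strip_count, suffix) edits from one sample lemma's forms.

    `examples` is a list of (lemma, features, word_form) triples for a
    single paradigm. Hypothetical helper, not taken from the paper.
    """
    table = {}
    for lemma, features, form in examples:
        # Longest shared prefix of lemma and inflected form acts as the stem.
        stem = commonprefix([lemma, form])
        strip = len(lemma) - len(stem)   # characters to drop from the lemma
        suffix = form[len(stem):]        # characters to append afterwards
        table[features] = (strip, suffix)
    return table


def generate(lemma, features, table):
    """Apply the learned edit for `features` to a lemma of the same paradigm."""
    strip, suffix = table[features]
    return lemma[:len(lemma) - strip] + suffix


# Sample forms for one noun paradigm (informally romanized; illustrative).
noun_am = learn_paradigm([
    ("maram", "PL", "marangkaL"),
    ("maram", "ACC", "maraththai"),
])

print(generate("pazham", "PL", noun_am))
print(generate("pazham", "ACC", noun_am))
```

Under these assumptions the only stored data is one table of edits per paradigm, which is in the spirit of the abstract's claim that the approach needs only a small amount of data; assigning a lemma to its paradigm is left outside the sketch.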